Analysis of Bi-directional Reranking Model for Uyghur-Chinese Neural Machine Translation
ZHANG Xinlu, LI Xiao, YANG Yating, WANG Lei, DONG Rui
Acta Scientiarum Naturalium Universitatis Pekinensis    2020, 56 (1): 31-38.   DOI: 10.13209/j.0479-8023.2019.093
Training a neural machine translation model on a low-resource corpus such as Uyghur-Chinese easily falls into a local optimum, so the translation produced by a single model may not be the globally optimal one. To address this problem, the probability distributions predicted by multiple models are integrated through an ensemble strategy, so that the translation models act as a whole. In addition, translation models with opposite decoding directions are combined by a reranking method based on cross entropy, and the candidate translation with the highest combined score is selected as the output. Experiments on the CWMT2015 Uyghur-Chinese parallel corpus show that the proposed method improves BLEU by 4.82 points over a single Transformer model.
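The two steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the equal-weight averaging, the interpolation weight, and the toy probabilities are all assumptions made for illustration. The sketch averages next-token distributions from several models, then scores each candidate by the negated, weighted cross entropy under a left-to-right and a right-to-left model and keeps the highest-scoring candidate.

```python
import math


def ensemble_average(distributions):
    """Average next-token probability distributions from several models.

    Assumption: equal weights; the abstract does not state the weighting.
    """
    n = len(distributions)
    vocab_size = len(distributions[0])
    return [sum(d[i] for d in distributions) / n for i in range(vocab_size)]


def cross_entropy(token_probs):
    """Mean negative log-probability a model assigns to a candidate's tokens."""
    return -sum(math.log(p) for p in token_probs) / len(token_probs)


def combined_score(candidate, weight=0.5):
    """Negated weighted cross entropy from both decoding directions.

    `weight` is a hypothetical interpolation parameter. Higher is better.
    """
    l2r = cross_entropy(candidate["l2r_probs"])
    r2l = cross_entropy(candidate["r2l_probs"])
    return -(weight * l2r + (1.0 - weight) * r2l)


def rerank(candidates, weight=0.5):
    """Return the candidate with the highest combined score."""
    return max(candidates, key=lambda c: combined_score(c, weight))


# Toy data: per-token probabilities each directional model assigns
# to two candidate translations (values are made up).
candidates = [
    {"text": "A", "l2r_probs": [0.9, 0.8], "r2l_probs": [0.2, 0.3]},
    {"text": "B", "l2r_probs": [0.7, 0.7], "r2l_probs": [0.7, 0.6]},
]
best = rerank(candidates)
averaged = ensemble_average([[0.6, 0.4], [0.2, 0.8]])
```

In this toy case candidate A is preferred by the left-to-right model alone, but the right-to-left model assigns it low probability, so the combined score selects B.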
Collaborative Analysis of Uyghur Morphology Based on Character Level
Turghun Osman, YANG Yating, Eziz Tursun, CHENG Li
Acta Scientiarum Naturalium Universitatis Pekinensis    2019, 55 (1): 47-54.   DOI: 10.13209/j.0479-8023.2018.067

The Uyghur language has a wide variety of inflectional affixes, complex word structures and phonetic changes. The authors propose a collaborative analysis method for Uyghur morphology at the character level. It covers three tasks: morpheme segmentation, morphological annotation and restoration of phonetic changes. The main characteristic of this method is the use of a composite tag to represent morpheme boundaries, annotations and phonetic changes jointly. In addition, character-level sequence labeling is used to train the model. Experimental results show that the accuracy of morpheme segmentation, morphological annotation and restoration of phonetic changes reaches 96.39%, 92.78% and 99.79% respectively. The overall accuracy of the system reaches 92.59%.
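The idea of a composite tag can be illustrated with a minimal sketch. The tag layout `boundary|label|change`, the labels `ROOT` and `PL`, and the restriction to same-length character substitutions are assumptions made for this example, not the scheme used in the paper. Each character receives one tag that encodes whether it starts a morpheme, the morpheme's annotation, and the original character if a phonetic change occurred.

```python
def encode(morphemes):
    """Turn (surface, label, restored) morphemes into per-character composite tags.

    Assumption: a phonetic change replaces a character one-for-one,
    so surface and restored forms have the same length.
    """
    tagged = []
    for surface, label, restored in morphemes:
        if len(surface) != len(restored):
            raise ValueError("this sketch only handles same-length substitutions")
        for i, (s, r) in enumerate(zip(surface, restored)):
            boundary = "B" if i == 0 else "I"   # B marks the first character of a morpheme
            change = "=" if s == r else r       # "=" means no phonetic change
            tagged.append((s, f"{boundary}|{label}|{change}"))
    return tagged


def decode(tagged):
    """Recover (surface, label, restored) morphemes from composite tags."""
    morphemes = []
    for char, tag in tagged:
        boundary, label, change = tag.split("|")
        original = char if change == "=" else change
        if boundary == "B" or not morphemes:
            morphemes.append([char, label, original])
        else:
            morphemes[-1][0] += char
            morphemes[-1][2] += original
    return [tuple(m) for m in morphemes]


# Illustrative example: "balilar" analysed as root "bala" (surface "bali")
# plus plural suffix "lar"; the labels are hypothetical.
analysis = [("bali", "ROOT", "bala"), ("lar", "PL", "lar")]
tags = encode(analysis)
recovered = decode(tags)
```

Because all three kinds of information live in one tag per character, a single sequence labeling model can predict segmentation, annotation and phonetic restoration together, which is the collaborative aspect the abstract describes.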
